Method and apparatus for agent optimization using speech synthesis and recognition

ABSTRACT

A system and method of automatic call handling allowing agent optimization in an automatic call distributor using voice recognition and speech synthesis technology. A speech synthesizer takes a script as input and generates speech as output. A prepared script includes speech input for the speech synthesizer. The method connects a call with a call contact, speaks to the call contact using speech generated with the prepared script as input to the speech synthesizer, receives live agent voice input, and recognizes agent speech.

[0001] This application relates generally to artificial intelligence using voice recognition and speech synthesis. More specifically, this application relates to a method and apparatus for automatic agents in an automatic call center using speech recognition and speech synthesis technology.

BACKGROUND

[0002] Call centers provide important solutions to customer contact problems in electronic commerce. An example of such a system is the Spectrum® Integrated Call Center System available from Rockwell Electronic Commerce, global provider of proven customer contact solutions for voice and Internet media. The Automatic Call Distributor (ACD) is a focal point of a customer contact center. The “calls” handled by an ACD can be any customer contact including but not limited to telephone calls, wireless calls, facsimile, e-mails, internet chat, and other internet-like contacts. Thus, an ACD can accept customer voice and data contacts and distribute each to a properly skilled agent representative, and can be implemented on any structure using hardware, software or a combination of both in, for example, an integrated system, a distributed system using multiple computers, or a software driven solution using a personal computer.

[0003] Call center administrators can set up a call flow or call flows for an ACD platform. Tools for managing call flows include scripts for routing calls. Such scripts can be programmed over a general-purpose computer included in an ACD platform.

[0004] Call routing allows a call center to accept each customer contact and distribute it to the proper agent representative. It is known to route calls using an icon-based Windows application tool. Using such a tool, a user can create a series of steps to route a customer contact efficiently to a desired agent. Scripts are used to direct calls to Agent Groups, Intelligent Announcements, ACD Mail, VRU ports and other call centers. Scripts can provide the call center with direct control call-by-call over interactions with the telephone network.

[0005] Existing technology requires that an individual agent handle each separate call. Under this existing technology, an individual agent must handle each call even if the call contact asks the same question and should receive the same responses as many other callers.

[0006] Known systems for handling calls in an automated call center require an extensive staff of agents, because an individual agent must participate in each call. Such individual call handling can be less than optimally efficient if a number of different call contacts tend to pose the same questions and should receive the same responses from agents. In addition to inefficiency, the quality of individual call handling can be affected by individual agent fatigue or by inadvertent errors or omissions by individual agents.

[0007] Current technology only allows for one agent per call. This requires an agent to handle each call even if the same questions or responses are given to or from one call contact as another call contact. If the agent is consistently responding to many callers in the same way, it would be desirable to automate the process and only require an agent to be involved when something unexpected or unique happens during the call.

SUMMARY

[0008] The invention, together with the advantages thereof, may be understood by reference to the following description in conjunction with the accompanying figures, which illustrate some embodiments of the invention.

[0009] One mode of practicing the invention is a method of automatic call handling allowing agent optimization in an automatic call distribution system that comprises synthesizing speech by using a script as input and generating speech as output, connecting a call from or to a call contact, and speaking to the call contact using speech generated using the prepared script as input. The method further comprises identifying any need to use a live agent to respond to call contact, making an extemporaneous script using an agent speech recognizer to recognize live agent voice input and generating speech for transmission to the call contact. The agent speech recognizer can, in various embodiments, perform speech-to-text conversion, speaker dependent recognition, continuous-speech recognition, contiguous-word recognition, and isolated-words recognition. The speech synthesizer can also include additional steps to record different pronunciations from a professional voice, save the different pronunciations recorded from a professional voice, and use the different pronunciations recorded from a professional voice to synthesize speech.

[0010] A second mode of practicing the invention is an agent assist method of automatic call handling in an automatic call distribution that includes the steps of synthesizing speech by using a prepared script as input and generating as output speech that sounds like a live agent's natural voice, connecting a call with a call contact, and transmitting synthesized speech generated using the prepared script as input to the call contact. The method further comprises receiving call contact voice input, transmitting the call contact voice input to the live agent, and accepting direction from the live agent indicating a response to the call contact's voice input. Responses to the call contact's voice input can in some embodiments include providing a second prepared script and connecting the call to the live agent for the live agent to converse with the call contact. One embodiment of this mode may include steps to monitor background noise surrounding the live agent and then introduce background noise into the synthesized speech to mimic the live agent's environment.

[0011] Other modes of practicing the invention include a system with an automatic call distributor that has means to perform the steps of the method summarized above. Yet another mode of practicing the invention is a computer program product embedded in a tangible medium of expression that includes computer program code to perform the steps summarized above.

REFERENCE TO THE DRAWINGS

[0012] Several aspects of the present invention are further described in connection with the accompanying drawings in which:

[0013] FIG. 1 is a block diagram illustrating one embodiment of a system including an automatic call distributor for agent optimization using speech synthesis and recognition.

[0014] FIG. 2 is a flowchart illustrating the flow of data and control of processes in an embodiment of a method of agent optimization using speech synthesis and recognition.

[0015] FIG. 3 is a flowchart illustrating the flow of data and control of processes in another embodiment of a method of agent optimization using speech synthesis and recognition.

[0016] FIG. 4 is a flowchart illustrating the flow of data and control of processes in an embodiment of a method of agent optimization using speech synthesis and recognition.

[0017] FIG. 5 is a flowchart further illustrating the flow of data and control of processes in an embodiment of a method of agent optimization using speech synthesis and recognition.

[0018] FIG. 6 is a flowchart illustrating the flow of data and control of processes in an embodiment of a method of agent optimization using speech synthesis and recognition.

DETAILED DESCRIPTION

[0019] While the present invention is susceptible of embodiment in various forms, there is shown in the drawings and will hereinafter be described some exemplary and non-limiting embodiments, with the understanding that the present disclosure is to be considered an exemplification of the invention and is not intended to limit the invention to the specific embodiments illustrated.

[0020] Several types of embodiments of the invention are described herein, including a first mode which provides for handling calls automatically, and a second mode which provides for assisting an agent and helping to optimize the agent's time. These modes use speech recognition and speech synthesis technology. Particular embodiments may use text-to-speech synthesis and speech-to-text conversion.

[0021] In one embodiment, a call center creates one or a group of “virtual agents.” Different pronunciations from a professional voice can be recorded and saved to be used when piecing together different scripts to create sounds as if a human voice is speaking. Speech recognition and speech synthesis, or in one embodiment speech-to-text conversion and text-to-speech conversion, can be used to allow the live agent to sound like the “virtual agent.”

[0022] When an agent replies to a call contact using the agent's own voice, the call contact may detect that the call contact has been previously talking to a machine, which some call contacts may dislike. It may therefore be desirable to mask the voice of the agent so that the agent's voice matches the voice of the “virtual agent.” The virtual agent can mask the live agent's voice by having what the agent says translated to text by speech recognition such as, in one embodiment, speech-to-text conversion. The text produced can then be spoken to the caller by using the same speech synthesis technology, which in one embodiment can be text-to-speech technology, used when the original script was spoken. In this way agents can sound like any of the “virtual agents” created by the call center and a particular live agent is no longer tied to a call.

[0023] In another embodiment, the agent's speech can be translated, not into natural language text, but instead into some phonetic or other intermediate representation of the agent's speech. This representation can then be supplied as input for speech synthesis. This embodiment may have the advantage of not requiring as sophisticated a speech recognition technology as would be required to produce textual output. Any other speech transformation technique which converts the speech of the agent so that it sounds like that of the virtual agent can also be used.

[0024] In some embodiments, a call center administrator can set up a call flow including different scripts that can be spoken/read to the call contact via text-to-speech conversion and a list of valid responses with which the call contact is expected to respond. When the call contact responds to a script, his/her response can be evaluated using speech recognition technology, which allows the computer to decide on an appropriate action. In one embodiment of this mode of practicing the invention, the action can be to save the call contact's response and to speak another appropriate script to the call contact in order to gather further information. In this way a call may be handled and routed through different text scripts without human interaction.
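By way of illustration only, a call flow of the kind just described might be represented as a set of named scripts, each carrying its list of valid responses and the script to speak next. The following Python sketch is not part of the disclosed embodiments; the names `Script`, `normalize`, `next_script` and `FLOW` are hypothetical, and the text normalization merely stands in for the output of a real speech recognizer.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class Script:
    """One prompt plus the responses the administrator expects (hypothetical)."""
    text: str
    # Maps a normalized valid response to the name of the next script.
    # None means the call flow ends after that response.
    responses: Dict[str, Optional[str]] = field(default_factory=dict)


def normalize(utterance: str) -> str:
    """Crude stand-in for recognizer output: lower-case, trim punctuation."""
    return " ".join(utterance.lower().strip(" .!?").split())


def next_script(flow: Dict[str, Script], current: str,
                utterance: str) -> Tuple[bool, Optional[str]]:
    """Return (was_expected, name_of_next_script) for a recognized response."""
    key = normalize(utterance)
    responses = flow[current].responses
    if key in responses:
        return True, responses[key]
    return False, None


# Example flow loosely modeled on the sample dialog in this description.
FLOW = {
    "greeting": Script("How are you doing today?", {"fine": "offer"}),
    "offer": Script("To switch you to this new plan all I need is your "
                    "permission.", {"yes": None, "no": None}),
}
```

With this representation, `next_script(FLOW, "greeting", "Fine.")` selects the "offer" script, while a response outside the valid list is reported as unexpected so that another part of the system can decide how to handle it.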

[0025] In some embodiments this automatic routing works only for the known set of responses programmed by the call center administrator. A connection to a live agent may be needed if the call contact gives an unexpected response to one of the scripts. When a call contact gives an unexpected response, that response can be saved, in one embodiment, allowing the information to be displayed to the agent when the agent receives the call. The information can also be displayed so that the call center administrator will be able to evaluate it to determine if it is appropriate to add this response and a reference to a new or existing script to the call flow specification. This evaluation can be done at any time, whether during the call or after the call is terminated.

[0026] The computer in one embodiment can respond to the call contact, indicating that the call contact's response was not understood, while it routes the call to a live agent to be handled correctly. For example: the computer can be programmed to respond, “Sorry I did not completely hear what you said, could you please repeat it again?” This response can allow time for the call to be routed to an agent. The call contact can then repeat the unknown response with the live agent listening to the call. When the call is routed to the agent in this embodiment, the script text that the computer was speaking when the unexpected response was given can also be sent to the agent. The text already spoken can be highlighted so that the agent will know what has been spoken to the call contact. In one embodiment, textual representations of valid responses that the call contact has given to earlier scripts can also be provided so that the agent can review what has transpired in the call. A bar can also be displayed to the agent, representing where the call is in the call flow. This information can give the agent a rough idea of what is being discussed with the call contact if the agent is familiar with the call flow being spoken. This bar can also be used in some embodiments to send the call contact back to an appropriate place within the call flow when the live agent is finished. This returning can be done, for example, by pointing with the mouse to the section of the call flow to continue the call and then clicking, thereby sending the caller back to the appropriate text to speech section and freeing up the agent to speak to another live person.
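As a rough illustration of the information that could be presented to the agent on hand-off, the Python sketch below marks the portion of the script already spoken and renders a simple bar showing the position of the call in the call flow. The function name `handoff_view`, the square brackets used in place of visual highlighting, and the bar format are all assumptions made for this example and are not taken from the embodiments described.

```python
from typing import Tuple


def handoff_view(script_text: str, spoken_chars: int, step: int,
                 total_steps: int, width: int = 10) -> Tuple[str, str]:
    """Return (marked_script, progress_bar) for display to the agent.

    The part of the script already spoken is wrapped in square brackets,
    a plain-text stand-in for highlighting on the agent's display.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    spoken = script_text[:spoken_chars]
    pending = script_text[spoken_chars:]
    filled = round(width * step / total_steps)
    bar = "#" * filled + "-" * (width - filled)
    return f"[{spoken}]{pending}", f"|{bar}| step {step} of {total_steps}"
```

A display built this way could also accept a click on the bar to choose the step at which automated handling resumes, although that interaction is not shown here.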

[0027] The table below recites an example of a dialog using one embodiment.

[0028] An outbound call has been completed to a live person.

[0029] Text-to-speech: Hello, may I please speak with Mr. Doe?

[0030] Call Contact: Please wait while I get him.

[0031] Call Contact: Hello, this is Mr. Doe.

[0032] Text-to-speech: Hello Mr. Doe this is Jon with Alexander Bell.

[0033] How are you doing today?

[0034] Call Contact: Fine.

[0035] Text-to-speech: Good. I am calling you today to talk to you about our new local service offer.

[0036] More text/speech explaining local service is given . . . .

[0037] To switch you to this new plan all I need is your permission.

[0038] Call Contact: Don't you offer long distance service also, and how can I bundle local and long distance together?

[0039] Unexpected Response—forward the call to an agent.

[0040] Text-to-Speech: I'm sorry I did not completely hear your question, could you please say it again?

[0041] Call Contact: Don't you offer long distance service also, and how can I bundle local and long distance together?

[0042] Agent: Sorry, we don't offer long distance yet.

[0043] Completes response and then sends the live person back to the computer.

[0044] Text-to-Speech: Thank you. Alexander Bell appreciates your business.

[0045] In another embodiment much of the same technology is used. This embodiment allows the agent to record different pronunciations of different sounds and save them to be used by a speech synthesis system and/or a text-to-speech conversion program when piecing together different scripts to create sounds as if a human voice is speaking. Thus, it sounds as if an individual agent is speaking instead of the “virtual agent” as discussed above.

[0046] When an agent accepts a call connection, an appropriate script can be spoken/read to the call contact in the agent's own voice by the speech synthesis system. At this point the agent can listen to and can hear the presentation to the call contact and all responses given. The agent in one embodiment can have the ability to choose and direct which script is spoken to the other party, or the ability to break into the conversation and take over the handling of the call when needed. Since the agent in such an embodiment only needs to be a passive participant on the call, the agent may be able to perform other functions while handling the call unless and until the live agent needs to speak to the call contact for some reason. For example: the agent may be able to monitor the call and perform other work because the call does not require the agent's full attention.

[0047] Allowing the agent to be a silent participant in the conversation may in some embodiments have advantages because the agent knows what is happening on the call if the agent needs to intervene. The agent already knows information about the call instead of attempting to learn about it immediately after the call is delivered or rerouted.

[0048] In one embodiment, this mode can also monitor the background noise surrounding the live agent. This measurement allows similar noise to be introduced into the call at a similar level when the synthesized speech (or, in one embodiment, text-to-speech) is being spoken to the call contact. Background noise can be prerecorded, or the actual ambient sound around the agent can be used. This introduction of background noise can reduce any perception of being “handed off” to a live agent when the live agent takes over.
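One way such level matching might be approached is sketched below in Python, offered only as an illustration under stated assumptions. Audio is assumed to be a sequence of floating-point samples in the range -1.0 to 1.0, the noise level is assumed to be measured as a root-mean-square (RMS) value, and the function names `rms` and `mix_background` are hypothetical. The description above does not specify how the noise level is measured or how the noise is mixed.

```python
import math
from typing import List, Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mix_background(speech: Sequence[float], noise: Sequence[float],
                   target_rms: float) -> List[float]:
    """Add noise to synthesized speech at approximately target_rms.

    The noise clip is looped if it is shorter than the speech, and the
    mixed signal is clipped to the range [-1.0, 1.0].
    """
    level = rms(noise)
    if level == 0.0:
        # Nothing usable to mix in; return the speech unchanged.
        return list(speech)
    gain = target_rms / level
    mixed = []
    for i, sample in enumerate(speech):
        noisy = sample + noise[i % len(noise)] * gain
        mixed.append(max(-1.0, min(1.0, noisy)))
    return mixed
```

Here `target_rms` would be the level measured around the live agent, for example `rms(ambient_samples)`, and `noise` may be either a prerecorded clip or the ambient samples themselves.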

[0049] Referring now to FIG. 1, there is disclosed a block diagram of one embodiment of a system including an automatic call distributor for using speech recognition and synthesis in call handling or to assist agents. An automatic call distributor (115) can connect calls through an external telephonic network (140) to call contacts (105A, 105B, 105C). The automatic call distributor (115) is connected to a live agent station (120), including manual input (122) such as a keyboard, audio input (124) such as a microphone, and audio output (126) such as headphones. The live agent station (120) in the embodiment depicted can also include a CPU (128), clock (130), RAM memory (132), a display (134) such as a CRT, and non-volatile memory (136) such as a hard disk. A live agent (190) operates the live agent station (120).

[0050] Referring still to the embodiment of FIG. 1, the automatic call distributor (115) is also connected to a virtual agent (160). The depicted embodiment of a virtual agent (160) includes sound output (162), sound input (164), a CPU (166), a memory (168) and a clock (170). The depicted embodiment of a virtual agent (160) also includes modules for speech recognition (172), such as speech-to-text conversion, and for speech synthesis (174), such as text-to-speech conversion. An alternative embodiment (not pictured) may include a second module for speech recognition, e.g., speaker independent speech recognition. The depicted embodiment of a virtual agent (160) can also include storage for a call record (178). The depicted embodiment of a virtual agent (160) includes or is connected to a call flow specification database (180) including one or more scripts (182) having associated responses (184A, 184B, 184C) and actions (186A, 186B, 186C) associated with those responses. The virtual agent (160) can be operably connected to the live agent station both directly and also through the automatic call distributor (115).

[0051] In one embodiment of an automatic call distributor (115), calls can be placed automatically by the automatic call distributor system itself. If there is no answer, the number is not in service, or the call otherwise fails to connect, then the automatic call distributor can proceed to the next number without connecting the call to an agent. If the call is connected, the call can then be assigned to a human agent or to a virtual agent as described in the various modes of practicing different embodiments of the present invention set forth above.
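The outbound behavior described in this paragraph can be pictured as a simple loop over a list of numbers. The Python sketch below is illustrative only: the outcome labels and the `dial` and `assign` callables are hypothetical stand-ins for telephony and routing functions that the description does not define.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical outcome labels; a real telephony interface would define its own.
CONNECTED = "connected"
NO_ANSWER = "no_answer"
NOT_IN_SERVICE = "not_in_service"


def place_calls(numbers: Iterable[str],
                dial: Callable[[str], str],
                assign: Callable[[str], None]) -> List[Tuple[str, str]]:
    """Dial each number in turn and assign only the calls that connect.

    Calls that fail to connect are skipped without involving any agent.
    Returns the (number, outcome) pairs in the order dialed.
    """
    outcomes = []
    for number in numbers:
        result = dial(number)
        if result == CONNECTED:
            assign(number)  # hand off to a human agent or a virtual agent
        outcomes.append((number, result))
    return outcomes
```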

[0052] Alternatively, calls placed from outside persons depicted here as call contacts (105A, 105B, 105C) can be received by the automatic call distributor. When a human agent or a virtual agent becomes available, this incoming call can be assigned to that available human agent or virtual agent.

[0053] Referring now to FIG. 2, there is illustrated a system flow chart depicting the control of operations and the data flow in an embodiment of a system for agent optimization using voice recognition and text to speech. Script data (205) is provided for the system to use. Control of the system passes first to a read script process (210), which generates voice output using the script data (205) provided. Control passes next to a receive spoken response process (220) that records the actual response (215) of a call contact after the script has been read. Expected responses data (230) is also provided. After the receive spoken response process (220) receives the actual response (215) from the call contact, control passes next to a compare actual response to expected response process (225). The system then determines whether the actual response (215) was an expected response (230). In the affirmative, if the actual response (215) was an expected response (230), control passes next to a handle expected response process (245). The handle expected response process (245) can record data, determine a follow-up script to read, write information offline, or perform other functions. In the negative, if the actual response (215) was not an expected response (230), control passes to a handle unexpected response process (240). The handle unexpected response process (240) can transfer control to a live agent, write information offline, or perform other functions. After the response has been processed, the system will determine whether the call has been terminated (250). If the call has been terminated, the algorithm ends, and can be re-invoked when another call is placed. If the call has not been terminated, control passes back to the read script process (210) which will read another provided script (205).
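The control flow of FIG. 2 might be sketched in Python roughly as follows. This is a simplified illustration, not the disclosed implementation: the function `run_call` and its parameters are hypothetical, speech synthesis and recognition are replaced by injected `speak` and `listen` callables, a `listen` result of `None` is assumed to mean that the call has been terminated, and the sketch ends automated handling once an unexpected response has been handed off.

```python
from typing import Callable, Dict, List, Optional, Tuple


def run_call(scripts: Dict[str, str],
             expected: Dict[str, Dict[str, Optional[str]]],
             start: str,
             speak: Callable[[str], None],
             listen: Callable[[], Optional[str]],
             on_unexpected: Callable[[str, str], None]) -> List[Tuple[str, str]]:
    """Drive one call through the loop of FIG. 2.

    scripts  : script name -> script text (script data 205)
    expected : script name -> {expected response: next script or None} (230)
    Returns the (script name, response) pairs handled as expected.
    """
    record = []
    current: Optional[str] = start
    while current is not None:
        speak(scripts[current])                # read script process (210)
        actual = listen()                      # receive spoken response (220)
        if actual is None:                     # call terminated (250)
            break
        options = expected.get(current, {})
        if actual in options:                  # compare (225)
            record.append((current, actual))   # handle expected response (245)
            current = options[actual]          # follow-up script, if any
        else:
            on_unexpected(current, actual)     # handle unexpected response (240)
            break                              # assumed: live agent takes over
    return record
```

Passing the speaking, listening and hand-off behavior in as callables keeps the loop itself independent of any particular recognizer, synthesizer or call distributor.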

[0054] Referring now to FIG. 3, there is depicted a system flowchart of another embodiment of a method for agent optimization using speech recognition and text to speech. A set of scripts (305) is provided. The system first calls a process script process (310), which generates artificial speech, then calls a send artificial speech to call contact process (320), a send artificial speech to agent process (325), and a display script text for agent process (330). After performing these processes, control passes next to a receive voice input from call contact process (335), which accepts voice input from the call contact with whom the telephone call was connected. Control passes next to a play voice input for agent process (340). In one mode of practicing this embodiment, voice input can be played for the agent at the same time it is received from the call contact. Control passes next to an agent instruction process (345) in which the system accepts instruction from the agent about the next action to be performed in response to the voice input. The agent instruction process (345) can take input from the agent in the form of, for example, keyboard entry or voice data input. Control passes next to an action branch process (350), which performs an action based on the agent instruction (345). As one example of such an action, the system can pass control to a play another script process (360) which passes control back to the process script process (310). As a second example of such an action, control can pass to an agent intervention process (370), which can permit the agent to interact directly with the call contact. After the agent intervention process (370) control passes next to a handle agent action process (380). Control can then pass back to the agent instruction process (345). As a third example of an action that can be performed in response to agent instruction, the system can invoke a terminate call process (390), which will terminate the call and hang up the telephone.
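The action branch of FIG. 3 can be illustrated with a small dispatch loop. In the Python sketch below, the `Instruction` enumeration, the `agent_assist` function and the injected `play` and `intervene` callables are hypothetical names chosen for this example. The agent's instructions are supplied as a prepared sequence rather than read from a keyboard or microphone, and the reception and playback of the call contact's voice input (335, 340) are omitted for brevity.

```python
from enum import Enum
from typing import Callable, Iterable, List, Optional, Tuple


class Instruction(Enum):
    """Agent instructions corresponding to the three example actions."""
    PLAY_SCRIPT = "play_script"   # play another script (360)
    INTERVENE = "intervene"       # agent intervention (370)
    TERMINATE = "terminate"       # terminate call (390)


def agent_assist(first_script: str,
                 instructions: Iterable[Tuple[Instruction, Optional[str]]],
                 play: Callable[[str], None],
                 intervene: Callable[[], None]) -> List[str]:
    """Run the agent-directed branch loop and return a log of actions taken."""
    log = []
    play(first_script)                          # process script (310-330)
    log.append(f"played {first_script}")
    for kind, argument in instructions:         # agent instruction (345)
        if kind is Instruction.PLAY_SCRIPT:     # action branch (350) -> 360
            play(argument)
            log.append(f"played {argument}")
        elif kind is Instruction.INTERVENE:     # -> 370, then 380
            intervene()
            log.append("agent intervened")
        elif kind is Instruction.TERMINATE:     # -> 390
            log.append("call terminated")
            break
    return log
```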

[0055] Referring now to FIG. 4, there is depicted another embodiment of a method for agent optimization using speech recognition and text to speech. The method first invokes a connect call process (410), which will connect the call from, for example, an automatic call distributor to a call contact. Control passes next to a speak prepared script process (420), which takes input from prepared script data (415) and generates as output artificial speech which is transmitted over the telephone lines to the call contact. The system can next perform a process such as receiving voice input from the call contact (not pictured). Control passes then to a receive live agent voice input process (430), in which the live agent speaks into an apparatus (such as, for example, a microphone) a reply appropriate to the response of the call contact to the prepared script. Upon receiving live agent voice input, control passes to a make extemporaneous script process (440) in which the live agent voice input is converted to synthesized speech to preserve the call continuity and avoid the perception that a different person is speaking. The make extemporaneous script process (440) creates extemporaneous script data (425), which may be in the form of text or some other convenient form such as phonetic data. Control passes next to a speak extemporaneous script process (450), in which the extemporaneous script data (425) is spoken to the call contact.
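The point of the FIG. 4 method is that prepared scripts and the live agent's replies pass through the same synthesizer, so the call contact hears one voice throughout. A minimal Python sketch of that arrangement follows. The class `VirtualVoice` and its method names are hypothetical, and the recognizer and synthesizer are injected callables, since no particular speech engine is specified in this description.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VirtualVoice:
    """Routes prepared and extemporaneous scripts through one synthesizer."""
    recognize: Callable[[bytes], str]    # agent audio -> script data (440)
    synthesize: Callable[[str], bytes]   # script data -> artificial speech

    def speak_prepared(self, script_text: str) -> bytes:
        """Speak prepared script data (415), as in process 420."""
        return self.synthesize(script_text)

    def speak_for_agent(self, agent_audio: bytes) -> bytes:
        """Convert live agent voice input (430) to the virtual agent's voice."""
        extemporaneous_script = self.recognize(agent_audio)   # data (425)
        return self.synthesize(extemporaneous_script)         # process 450


# Stand-in engines used only to exercise the sketch: the "audio" is just
# UTF-8 text, and the synthesizer tags its output with a voice label.
def fake_recognize(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_synthesize(text: str) -> bytes:
    return b"VOICE-A:" + text.encode("utf-8")
```

In this sketch the extemporaneous script data is text; an embodiment using a phonetic or other intermediate representation would change only the types exchanged between the two callables.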

[0056] Referring now to FIG. 5, there is depicted another embodiment of a method and apparatus for agent optimization using voice recognition and text to speech. Within the illustrated embodiment, a record professional voice process (510) records samples of speech from a professional voice. A save different pronunciations process (520) then saves pronunciations from one or more professional voices. A use collected pronunciations for speech synthesizer process (530) then uses those saved pronunciations to generate an artificial voice in a speech synthesis module. Processes 510, 520, and 530 for preparing an artificial voice can be performed separately from the other processes depicted in FIG. 5. A connect call process (540) connects a call with a call contact using, for example, an automatic call distributor. After the call has been connected, a speak prepared script process (550) will use prepared script data (555) assembled by, for example, a supervisor, to communicate a selected message to a call contact. After the prepared script is spoken, the system can then perform other processes, not pictured, such as receiving voice input from the call contact, which also can be monitored by a live agent. A receive live agent voice input process (560) accepts voice input from a live agent by, for example, a microphone. A make extemporaneous script process (570) then takes the live agent voice input and converts it to extemporaneous script data (575). Extemporaneous script data (575) can be in the form of text or some other data reflecting speech. A speak extemporaneous script process (580) then generates artificial speech using the extemporaneous script data (575).
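Processes 510 through 530 can be pictured, in very simplified form, as building a store of recorded pronunciations and then piecing a script together from them. The Python sketch below assumes, purely for illustration, that pronunciations are stored per word as raw audio clips and are joined end to end. The function name `synthesize_from_units` is hypothetical, and a practical synthesizer would likely use finer units and smooth the joins between clips.

```python
from typing import Dict


def synthesize_from_units(text: str, units: Dict[str, bytes]) -> bytes:
    """Concatenate saved pronunciations (520) to speak a script (530).

    units maps a lower-case word to its recorded audio clip. A word with
    no saved pronunciation raises KeyError rather than being skipped.
    """
    clips = []
    for word in text.lower().split():
        if word not in units:
            raise KeyError(f"no recorded pronunciation for {word!r}")
        clips.append(units[word])
    return b"".join(clips)
```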

[0057] Referring now to FIG. 6, additional detail illustrating another embodiment is shown. The processes that appeared in FIG. 5 bear the same reference numbers. After the speak prepared script process (550), control passes next to a get call contact voice input process (610). Control passes next to a compare input to expected response process (620), which makes its comparison based on call flow specification data (615) that can be provided, for example, during system set up. Control passes next to a was-input-expected decision (625). If the input is an expected response, control passes next to a perform associated action process (640). In another embodiment not pictured, an expected response can also be written to call history data (660) for possible later use. After the perform associated action process (640), control passes next to a call termination decision (650). If the call is terminated, the process ends. If the call is not terminated, control passes back to the speak prepared script process (550), which reads a new script appropriate to the caller's previous response. If the was-input-expected decision (625) determines that the input was not expected, control passes to a record response process (630) which records the unexpected response to a call history. Unexpected responses in the call history can be used to update the call flow specification (615) with additional appropriate scripts. Control passes then to a route call to agent process (635), which connects the call to a live agent. The live agent provides a spoken reply to the unexpected response received from the call contact, control of which is handled by the receive live agent voice input process (560), the make extemporaneous script process (570) and the speak extemporaneous script process (580), all described above in connection with FIG. 5. After the extemporaneous script has been spoken using artificial speech, the agent can continue to communicate with the call contact using voice synthesis until such time as the call is terminated or until the agent is able to identify another appropriate script to be read.
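As one illustration of how recorded unexpected responses might inform updates to the call flow specification, the Python sketch below tallies responses from a call history and reports those that recur. The function `frequent_unexpected`, the shape of the history records and the `min_count` threshold are assumptions made for this example; the decision to add a response or script remains with the call center administrator.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def frequent_unexpected(history: Iterable[Tuple[str, str]],
                        min_count: int = 2) -> List[Tuple[str, str, int]]:
    """Summarize recurring unexpected responses from a call history.

    history holds (script name, unexpected response text) pairs, as might be
    written by the record response process (630). Responses are compared
    after trimming whitespace and lower-casing. Returns (script, response,
    count) tuples, most frequent first, for entries seen at least
    min_count times.
    """
    counts = Counter((script, response.strip().lower())
                     for script, response in history)
    return [(script, response, n)
            for (script, response), n in counts.most_common()
            if n >= min_count]
```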

[0058] An advantage of some embodiments may be to reduce agent staffing requirements. Depending on the complexity and sophistication of the call center's call flow, the number of agents needed to support a call center may be greatly reduced.

[0059] Another advantage of some embodiments may be to allow for a more consistent and accurate delivery of the presentation given to the call contact. The synthetic speech will never sound fatigued or skip important sections of the script being spoken to the call contact. Also, some embodiments can remove any distracting background noise from the call. When a script is being spoken to a call contact, no background noise need be present, and when an agent is having his/her voice filtered through speech-to-text and text-to-speech again to sound as if the “virtual agent” is talking, all background noise can be removed.

[0060] Another advantage of some embodiments may be to allow a more careful tracking of what is said by the call contact. Since the “virtual agent's” voice is a known voice pattern that will always sound the same, it can be filtered out of any transcripts or recordings taken of the call. This will remove any confusion over what the live party has said in the transaction.

[0061] Another advantage of some embodiments including speech-to-text conversion or text-to-speech conversion may be to allow the call center administrator to change or enhance the script that is spoken to the caller. After the administrator changes the scripts, the changes can take effect whenever the administrator chooses.

[0062] While the present invention has been described in the context of particular exemplary data structures, processes, and systems, those of ordinary skill in the art will appreciate that the processes of the present invention are capable of being distributed in the form of a functional unit, or software for an information processing system.

[0063] A functional unit is an entity of hardware or software, or both, capable of accomplishing a specified purpose. Hardware is all or part of the physical components of an information processing system. Software includes all or part of the programs, procedures, rules, and associated documentation of an information processing system. An information processing system is one or more data processing systems and devices, such as office and communication equipment, that perform information processing. A data processing system includes one or more computers, peripheral equipment, and software that perform data processing.

[0064] A computer is a functional unit that can perform substantial computations, including numerous arithmetic operations and logic operations without human intervention. A computer can consist of a stand-alone unit or can comprise several interconnected units. In information processing, the term computer usually refers to a digital computer, which is a computer that is controlled by internally stored programs and that is capable of using common storage for all or part of a program and also for all or part of the data necessary for the execution of the programs; performing user-designated manipulation of digitally represented discrete data, including arithmetic operations and logic operations; and executing programs that modify themselves during their execution. A computer program is a syntactic unit that conforms to the rules of a particular programming language and that is composed of declarations and statements or instructions needed to solve a certain function, task, or problem. A programming language is an artificial language (a language whose rules are explicitly established prior to its use) for expressing programs.

[0065] Software for an information processing system can be stored as instructions and the like on a computer readable medium in a variety of forms. The present invention applies equally regardless of the particular type of signal bearing computer readable media actually used to carry out the distribution. Computer readable media includes any recording medium in which computer code may be fixed, including but not limited to CDs, DVDs, semiconductor RAM, ROM, or flash memory, paper tape, punch cards, and any optical, magnetic, or semiconductor recording medium or the like. Examples of computer readable media include recordable-type media such as a floppy disk, a hard disk drive, a RAM, and CD-ROMs, DVD-ROMs, an online internet web site, tape storage, and compact flash storage, and transmission-type media such as digital and analog communications links, and any other volatile or non-volatile mass storage system readable by the computer. The computer readable medium includes cooperating or interconnected computer readable media, which exist exclusively on a single computer system or are distributed among multiple interconnected computer systems that may be local or remote. Many other configurations of these and similar components (which can also comprise a computer system) are considered equivalent and are intended to be encompassed within the scope of the claims herein.

[0066] Although embodiments have been shown and described, it is to be understood that various modifications and substitutions, as well as rearrangements of parts and components, can be made by those skilled in the art, without departing from the scope of the invention defined in the appended claims. Therefore, the spirit and scope of the appended claims should not be limited to the description of the embodiments contained herein. Although the appended claims do not recite every detail described above in illustrated embodiments, the claims are intended to cover that subject matter also, in addition to the subset of that subject matter recited below as limitations. The entire subject matter is therefore claimed and not dedicated to the public. The appended claims are contemplated to cover the present invention and any and all modifications, variations, or equivalents that fall within the true spirit and scope of the basic underlying principles disclosed and claimed herein.

I claim:
 1. A method of automatic call handling allowing agent optimization in an automatic call distribution system, the method comprising: synthesizing speech using a prepared script as input to generate synthesized speech; connecting a call with a call contact; speaking to the call contact using speech synthesized using the prepared script; receiving live agent voice input; making an extemporaneous script using automated agent speech recognition to recognize the live agent voice input; and generating synthesized speech for transmission to the call contact using the extemporaneous script. The method of automatic call handling allowing agent optimization according to claim 1 wherein the speech synthesizing is performed using text-to-speech synthesis.
 2. The method of automatic call handling allowing agent optimization according to claim 1 wherein the automated agent speech recognition is speech-to-text conversion.
 3. The method of automatic call handling allowing agent optimization according to claim 1 wherein the automated agent speech recognition is speaker dependent recognition.
 4. The method of automatic call handling allowing agent optimization according to claim 1 wherein the automated agent speech recognition is recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 5. The method of automatic call handling allowing agent optimization according to claim 1, further comprising: recording different pronunciations from a professional voice; saving the different pronunciations recorded from a professional voice; and using the different pronunciations recorded from a professional voice to synthesize speech.
 6. The method of automatic call handling allowing agent optimization according to claim 1, further comprising: receiving call contact voice input in response to the prepared script; comparing the call contact voice input to a valid response associated with the prepared script by recognizing call contact speech; and if the call contact voice input is recognized as the valid response then performing an appropriate action associated with the valid response.
 7. The method of automatic call handling allowing agent optimization according to claim 6 wherein the recognizing call contact speech performs recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 8. The method of automatic call handling allowing agent optimization according to claim 6 wherein recognizing call contact speech is selected from the group consisting of a text-dependent recognizer and a text-independent recognizer.
 9. The method of automatic call handling allowing agent optimization according to claim 6 wherein recognizing call contact speech uses a recognition vocabulary including the valid response.
 10. The method of automatic call handling allowing agent optimization according to claim 6 wherein recognizing call contact speech uses a speaker-dependent system.
 11. The method of automatic call handling allowing agent optimization according to claim 6, further comprising: if the call contact voice input is not the valid response then routing the call to the live agent.
 12. The method of automatic call handling allowing agent optimization according to claim 11 wherein routing the call to a live agent includes asking the call contact to repeat the call contact voice input that was not recognized as the valid response.
 13. The method of automatic call handling allowing agent optimization according to claim 11 wherein routing the call to a live agent includes sending the prepared script to the live agent.
 14. The method of automatic call handling allowing agent optimization according to claim 6, wherein: a plurality of valid responses are associated with the prepared script; and an appropriate action is associated with each valid response.
 15. A virtual agent for live agent optimization in an automatic call distribution system, the virtual agent comprising: means for synthesizing speech using a prepared script as input and generating synthesized speech; means for connecting a call with a call contact; means for speaking to the call contact using synthesized speech generated using the prepared script; means for receiving live agent voice input; means for making an extemporaneous script using agent speech recognition to recognize the live agent voice input; and means for speaking to the call contact using speech generated using the extemporaneous script as input to the means for synthesizing speech.
 16. The virtual agent for live agent optimization in an automatic call distribution system according to claim 15, wherein the means for synthesizing speech is a text-to-speech synthesizer.
 17. The virtual agent for live agent optimization in an automatic call distribution system according to claim 15, wherein the agent speech recognition comprises speech-to-text conversion.
 18. The virtual agent for live agent optimization in an automatic call distribution system according to claim 15, wherein the agent speech recognition comprises speaker dependent recognition.
 19. The virtual agent for live agent optimization in an automatic call distribution system according to claim 15, wherein the agent speech recognition comprises recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 20. The virtual agent for live agent optimization in an automatic call distribution system according to claim 15, further comprising: means for recording different pronunciations from a professional voice; means for saving the different pronunciations recorded from a professional voice; and means for using the different pronunciations recorded from a professional voice to synthesize speech.
 21. The virtual agent for live agent optimization in an automatic call distribution system according to claim 15, further comprising: means for receiving call contact voice input in response to a prepared script; a means for comparing the call contact voice input to a valid response associated with the prepared script, said means for comparing using call contact speech recognition; and a means for performing an appropriate action associated with the valid response if the call contact voice input is recognized as the valid response.
 22. The virtual agent for live agent optimization in an automatic call distribution system according to claim 21 wherein the call contact speech recognition is selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 23. The virtual agent for live agent optimization in an automatic call distribution system according to claim 21, wherein the call contact speech recognition is selected from the group consisting of text-dependent recognition and text-independent recognition.
 24. The virtual agent for live agent optimization in an automatic call distribution system according to claim 21 wherein the call contact speech recognition uses a recognition vocabulary including the valid response.
 25. The virtual agent for live agent optimization in an automatic call distribution system according to claim 21 wherein the call contact speech recognition is speaker-dependent.
 26. The virtual agent for live agent optimization in an automatic call distribution system according to claim 21 further comprising: means for routing the call to the live agent if the call contact voice input is not the valid response.
 27. The virtual agent for live agent optimization in an automatic call distribution system according to claim 26 wherein routing the call to a live agent includes asking the call contact to repeat the call contact voice input that was not recognized as the valid response.
 28. The virtual agent for live agent optimization in an automatic call distribution system according to claim 26 wherein routing the call to a live agent includes sending the prepared script to the live agent.
 29. The virtual agent for live agent optimization in an automatic call distribution system according to claim 21, wherein a plurality of valid responses are associated with the prepared script and an appropriate action is associated with each valid response.
 30. A virtual agent for live agent optimization in an automatic call distribution system, the virtual agent comprising: a speech synthesizer; a prepared script, applied as input to the speech synthesizer to generate synthesized speech; a telephonic network connecting a call with a call contact; a microphone that receives live agent voice input; an agent speech recognizer that makes an extemporaneous script from the live agent voice input; and a communications channel that conveys to the call contact the synthesized speech generated using the prepared script as input to the speech synthesizer and speech generated using the extemporaneous script as input to the speech synthesizer.
 31. The virtual agent for live agent optimization in an automatic call distributor according to claim 30 wherein the speech synthesizer is a text-to-speech synthesizer.
 32. The virtual agent for live agent optimization in an automatic call distributor according to claim 30 wherein the agent speech recognizer performs speech-to-text conversion.
 33. The virtual agent for live agent optimization in an automatic call distributor according to claim 30 wherein the agent speech recognizer performs speaker dependent recognition.
 34. The virtual agent for live agent optimization in an automatic call distributor according to claim 30 wherein the agent speech recognizer performs recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 35. The virtual agent for live agent optimization in an automatic call distributor according to claim 30, wherein the speech synthesizer comprises: a recorder that records different pronunciations from a professional voice; a computer readable medium on which the different pronunciations recorded from a professional voice are saved; and a synthesizer that combines the different pronunciations recorded from a professional voice to synthesize speech.
 36. The virtual agent for live agent optimization in an automatic call distributor according to claim 30, further comprising: a call contact speech recognizer; a valid response associated with the prepared script; an appropriate action associated with the valid response; the communications channel receiving call contact voice input in response to the synthesized speech; and a processor that compares the call contact voice input to the valid response using the call contact speech recognizer and performs the appropriate action associated with the valid response if the call contact voice input is recognized as the valid response.
 37. The virtual agent for live agent optimization in an automatic call distributor according to claim 36 wherein the call contact speech recognizer performs recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 38. The virtual agent for live agent optimization in an automatic call distribution system according to claim 36 wherein the call contact speech recognizer is selected from the group consisting of a text-dependent recognizer and a text-independent recognizer.
 39. The virtual agent for live agent optimization in an automatic call distribution system according to claim 36 wherein the call contact speech recognizer has a recognition vocabulary including the valid response.
 40. The virtual agent for live agent optimization in an automatic call distribution system according to claim 36 wherein the call contact speech recognizer is a speaker-dependent system.
 41. The virtual agent for live agent optimization in an automatic call distribution system according to claim 36 further comprising: a router that routes the call to the live agent if the call contact voice input is not the valid response.
 42. The virtual agent for live agent optimization in an automatic call distribution system according to claim 41 wherein routing the call to a live agent includes asking the call contact to repeat the call contact voice input that was not recognized as the valid response.
 43. The virtual agent for live agent optimization in an automatic call distributor according to claim 41 wherein routing the call to a live agent includes sending the prepared script to the live agent.
 44. The virtual agent for live agent optimization in an automatic call distributor according to claim 36, further comprising: a plurality of valid responses associated with the prepared script; and an appropriate action associated with each valid response.
 45. A computer program product embodied in a computer readable medium and providing a virtual agent for live agent optimization in an automatic call distribution system, the computer program product comprising: a speech synthesis code segment that takes a script as input and generates synthesized speech as output in response thereto; an initialization code segment that provides a prepared script, for use as input for the speech synthesis code segment; a live agent voice input code segment; an agent speech recognition code segment that makes an extemporaneous script by recognizing the live agent voice input; and a speech synthesis code segment that transmits to a call contact the synthesized speech generated using the prepared script as input and speech generated using the extemporaneous script as input to the speech synthesis code segment.
 46. The virtual agent for live agent optimization in an automatic call distribution system according to claim 45 wherein the speech synthesis code segment includes code to perform text-to-speech synthesis.
 47. The virtual agent for live agent optimization in an automatic call distribution system according to claim 45 wherein the agent speech recognition code segment includes code to perform speech-to-text conversion.
 48. The virtual agent for live agent optimization in an automatic call distribution system according to claim 45 wherein the agent speech recognition code segment includes code to perform speaker dependent recognition.
 49. The virtual agent for live agent optimization in an automatic call distribution system according to claim 45 wherein the agent speech recognition code segment includes instructions to perform recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 50. The virtual agent for live agent optimization in an automatic call distribution system according to claim 45, wherein the speech synthesis code segment further comprises: a code segment that records different pronunciations from a professional voice; a code segment that saves the different pronunciations recorded from a professional voice; and a code segment that uses the different pronunciations recorded from a professional voice to synthesize speech.
 51. The virtual agent for live agent optimization in an automatic call distribution system according to claim 45, further comprising: a call contact speech recognition code segment; the initialization code segment providing a valid response associated with the prepared script; the initialization code segment providing an appropriate action associated with the valid response; a call contact voice input code segment that receives voice input from the call contact in response to the prepared script; an evaluation code segment that compares the call contact voice input to the valid response using the call contact speech recognition code segment; and a control code segment that performs the appropriate action associated with the valid response if the call contact voice input is recognized as the valid response.
 52. The virtual agent for live agent optimization in an automatic call distribution system according to claim 51 wherein the call contact speech recognition code segment performs recognition selected from the group consisting of continuous-speech recognition, contiguous-words recognition and isolated-words recognition.
 53. The virtual agent for live agent optimization in an automatic call distribution system according to claim 51 wherein the call contact speech recognition code segment is selected from the group consisting of a text-dependent recognizer and a text-independent recognizer.
 54. The virtual agent for live agent optimization in an automatic call distribution system according to claim 51 wherein the call contact speech recognition code segment has a recognition vocabulary including the valid response.
 55. The virtual agent for live agent optimization in an automatic call distribution system according to claim 51 wherein the call contact speech recognition code segment is a speaker-dependent system.
 56. The virtual agent for live agent optimization in an automatic call distribution system according to claim 51 further comprising: a routing code segment that sends the call to the live agent if the call contact voice input is not the valid response.
 57. The virtual agent for live agent optimization in an automatic call distribution system according to claim 56 wherein the routing code segment asks the call contact to repeat the call contact voice input that was not recognized as the valid response.
 58. The virtual agent for live agent optimization in an automatic call distribution system according to claim 56 wherein the routing code segment sends the prepared script to the live agent.
 59. An agent assist method of automatic call handling in an automatic call distribution system, the method comprising: synthesizing speech using a prepared script as input and generating as output synthesized speech that sounds like a live agent's natural voice; connecting a call with a call contact; speaking the synthesized speech generated using the prepared script to the call contact; receiving call contact voice input; transmitting the call contact voice input to the live agent; and accepting direction from the live agent indicating a response to the call contact's voice input.
 60. The agent assist method of automatic call handling according to claim 59 wherein the response to the call contact's voice input comprises providing a second prepared script.
 61. The agent assist method of automatic call handling according to claim 59 wherein the response to the call contact's voice input comprises connecting the call to the live agent for the live agent to converse with the call contact.
 62. The agent assist method of automatic call handling according to claim 59 further comprising: monitoring background noise surrounding the live agent; and introducing background noise into the synthesized speech to emulate the live agent's environment.
 63. The agent assist method of automatic call handling according to claim 59 wherein the live agent is free to work on other tasks during the telephone call.
 64. An agent assist system to assist a live agent with automatic call handling in an automatic call distribution system, the agent assist system comprising: a speech synthesizer that uses a prepared script as input and generates as output synthesized speech that sounds like a live agent's natural voice; a network that connects a call with a call contact; a communications channel that transmits to the call contact the synthesized speech generated using the prepared script, that receives voice input from the call contact, and that communicates voice input from the call contact to the live agent; and a controller that accepts direction from the live agent indicating a response to the call contact's voice input.
 65. The agent assist system according to claim 64 wherein the controller enables the live agent to indicate a second prepared script for use in the call.
 66. The agent assist system according to claim 64 wherein the controller enables the live agent to speak directly with the call contact.
 67. The agent assist system according to claim 64 further comprising: a detector to monitor background noise surrounding the live agent; the speech synthesizer introducing background noise into the synthesized speech to emulate the live agent's environment.
 68. The agent assist system according to claim 64 wherein the live agent is free to work on other tasks during the telephone call.
 69. An agent assist system to assist a live agent with automatic call handling in an automatic call distributor, the system comprising: means for synthesizing speech that generates from a script output synthesized speech that sounds like a live agent's natural voice; means for connecting a call with a call contact; means for speaking to the call contact speech generated using a prepared script as input to the speech synthesizer; means for receiving call contact voice input; means for transmitting the call contact voice input to the live agent; and means for accepting direction from the live agent indicating a response to the call contact's voice input.
 70. The agent assist system according to claim 69 wherein the response to the call contact's voice input comprises providing a second prepared script.
 71. The agent assist system according to claim 69 wherein the response to the call contact's voice input comprises connecting the call to the live agent for the live agent to converse with the call contact.
 72. The agent assist system according to claim 69 further comprising: means for monitoring background noise surrounding the live agent; and the speech synthesizer introducing background noise into the synthesized speech to emulate the live agent's environment.
 73. The agent assist system according to claim 69 wherein the live agent is free to work on other tasks during the telephone call.
 74. A computer program product embedded in a computer readable medium, the computer program product assisting a live agent handling calls, the computer program product comprising: a computer readable medium containing computer program code segments comprising a speech synthesis code segment that generates from a prepared script output synthesized speech that sounds like a live agent's natural voice; an initialization code segment that provides the prepared script; a connecting code segment that places a call to a call contact; a communications code segment that controls transmission to the call contact the synthesized speech generated using the prepared script as input to the speech synthesizer; receiving call contact voice input; and transmitting the call contact voice input to the live agent; and a controller code segment that accepts direction from the live agent indicating a response to the call contact's voice input.
 75. The computer program product embedded in a computer readable medium according to claim 74 wherein the response to the call contact's voice input comprises providing a second prepared script.
 76. The computer program product embedded in a computer readable medium according to claim 74 wherein the response to the call contact's voice input comprises connecting the call to the live agent for the live agent to converse with the call contact.
 77. The computer program product embedded in a computer readable medium according to claim 74 wherein the communications code segment monitors background noise surrounding the live agent; and the speech synthesis code segment introduces background noise into the synthesized speech to emulate the live agent's environment.
 78. The computer program product embedded in a computer readable medium according to claim 74 wherein the live agent is free to work on other tasks during the telephone call. 
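
By way of illustration only, and without limiting the claims, the flow recited in claims 1, 6 and 11 might be sketched in Python as follows. This is a minimal sketch under stated assumptions: the names PreparedScript, synthesize, recognize and handle_call are hypothetical and do not appear in the specification, the synthesizer and recognizer are text-only stand-ins rather than real speech engines, and voice input is modeled as a plain string.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class PreparedScript:
    """A prepared prompt plus valid responses and their actions (hypothetical structure)."""
    prompt: str
    # Maps each valid response to the action performed when it is recognized.
    actions: Dict[str, Callable[[], str]] = field(default_factory=dict)


def synthesize(script_text: str) -> str:
    """Stand-in for a speech synthesizer: returns a tagged string instead of audio."""
    return f"<speech>{script_text}</speech>"


def recognize(voice_input: str) -> str:
    """Stand-in for a speech recognizer: the 'audio' is already text and is only normalized."""
    return voice_input.strip().lower()


def handle_call(script: PreparedScript, contact_voice: str, agent_voice: str) -> List[str]:
    """Return everything spoken to the call contact during one exchange."""
    # Speak the prepared script to the call contact.
    spoken = [synthesize(script.prompt)]
    # Compare the recognized call contact input to the valid responses.
    action = script.actions.get(recognize(contact_voice))
    if action is not None:
        # Valid response: perform the associated action and speak its result.
        spoken.append(synthesize(action()))
    else:
        # Not a valid response: the live agent answers by voice; the recognized
        # text becomes the extemporaneous script that is then synthesized.
        extemporaneous_script = recognize(agent_voice)
        spoken.append(synthesize(extemporaneous_script))
    return spoken


if __name__ == "__main__":
    demo = PreparedScript(
        prompt="Would you like to renew?",
        actions={"yes": lambda: "Your renewal is confirmed."},
    )
    for utterance in handle_call(demo, "maybe later", "I can call you back tomorrow"):
        print(utterance)
```

In this sketch the call contact hears only synthesized output in both branches, which is one way to read the combination of a prepared script and an extemporaneous script feeding the same synthesizer; an actual embodiment may differ.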